Abstract

I explain a direct approach to differentiation and integration. Instead
of relying on the general notions of real numbers, limits and continuity,
we treat functions as the primary objects of our theory, and view
differentiation as division of $f(x) - f(a)$ by $x - a$ in a certain class of
functions. When $f$ is a polynomial, the division can be carried out explicitly.
To see why a polynomial with a positive derivative is increasing (the
monotonicity theorem), we use the estimate
$|f(x) - f(a) - f'(a)(x - a)| \le K(x - a)^2$.
By making it into a definition we arrive at the notion of uniform Lipschitz
differentiability (ULD), and see that the derivative of a ULD function is
Lipschitz. Taking different moduli of continuity instead of the absolute
value, we get different flavors of calculus, each rather elementary, but
all together covering the total range of uniformly differentiable functions.
Using the class of functions continuous at $a$, we recapture the classical
notion of pointwise differentiability. It turns out that uniform Lipschitz
differentiability is equivalent to divisibility of $f(x) - f(a)$ by $x - a$ in
the class of Lipschitz functions of two variables, $x$ and $a$. The same is true
for any subadditive modulus of continuity. In this bottom-up, computational,
one-modulus-of-continuity-at-a-time approach to calculus, the monotonicity
theorem takes center stage and provides the aspects of the subject
that are important for practical applications. The weighty ontological
issues of compactness and completeness can be treated lightly or postponed,
since they are hardly used in this streamlined approach, which pretty much
follows Vladimir Arnold's "principle of minimal generality, according to
which every idea should first be understood in the simplest situation; only
then can the method developed be applied to more complicated cases." I
briefly discuss a generalization to many variables.



1 Two Stories, One Fictional, One Real 
1.1 Differentiating without using limits 

A teacher asks a student to calculate the derivative of $x^4$ at $x = a$. The
student writes down the difference quotient $\frac{x^4 - a^4}{x - a}$, then, by
factoring the numerator, rewrites it as
$\frac{(x - a)(x + a)(x^2 + a^2)}{x - a}$, then cancels $x - a$ and gets
$(x + a)(x^2 + a^2)$, then substitutes $x = a$ and gets $4a^3$, which is the
right answer, of course. The teacher does not like the solution, and the
following conversation takes place.

T: Your answer is correct, but why didn't you use the definition of the 
derivative as a limit? We are studying calculus here, you know. 

S: Do I really need to use limits? It looks like a waste of time, I can just 
simplify and plug in x = a instead, it looks like it works fine. 

T: But do you understand why it works? 

S: Hmmm, let me see. I guess it works because the limit of $(x + a)(x^2 + a^2)$
as $x \to a$ is $4a^3$, so, instead of calculating the limit we can just plug
$x = a$ into $(x + a)(x^2 + a^2)$.

T: What do you call a function such that you can just plug $x = a$ into it
instead of calculating the limit of this function at $a$?
S: Continuous at a? Yeah, I remember. 

T: Right! You know, people differentiated polynomials, roots and trig
functions in the 17th century, long before they started thinking of such
generalities as continuity and limits in the 19th century. Why don't you try to
differentiate your way some other simple algebraic expressions, such as
$\sqrt{x}$ or [illegible]?
S: O.K., I will, I think I understand it a little better now. 



1.2 Differentiating $\sqrt{x}$ without using limits

It happened in the fall of 1997, when I taught two calculus recitation sections at 
Suffolk University. The purpose of these sections was to answer the questions 
the students had about their homework and the subject in general. The text 
was Anton's Calculus, which I came to hate as the semester progressed. 

It was in one of these classes that some students asked me to explain how to
differentiate $\sqrt{x}$. So I wrote down the difference quotient
$\frac{\sqrt{x} - \sqrt{a}}{x - a}$ on the chalkboard
and said that we had to calculate the limit of this expression as $x$ approaches $a$.

As soon as I uttered the word "limit" I saw many students slump in their 
seats, their eyes glazing over, and I had the sinking feeling that they were totally 
lost. I had to do something fast to help them, to pull them out of their despair, 
but what? 

I said, look, you don't really need limits to calculate this derivative, you
can do it algebraically. Let us rewrite this expression in such a way that it
would make sense for $x = a$. How can we do that? Let us rewrite the
denominator as $(\sqrt{x})^2 - (\sqrt{a})^2$ and factor it as
$(\sqrt{x} - \sqrt{a})(\sqrt{x} + \sqrt{a})$, so

    $\frac{\sqrt{x} - \sqrt{a}}{x - a} = \frac{\sqrt{x} - \sqrt{a}}{(\sqrt{x} - \sqrt{a})(\sqrt{x} + \sqrt{a})} = \frac{1}{\sqrt{x} + \sqrt{a}}$,

which makes sense for $x = a$, giving us the answer
$(\sqrt{x})' = 1/(2\sqrt{x})$. That's all there is to it.

I saw the students brightening up a little bit, when they realized that the 
problem could be solved with the tools familiar to them. And that's exactly 
when it dawned on me that all calculus could be done like that, differentiation 
being nothing but division in the class of continuous functions. It surely looked 
like a promising idea. 
2 Calculus of Polynomials



2.1 Formal differentiation 

Let us start with the simplest and most popular example, differentiating $x^2$.
We form the difference quotient $\frac{x^2 - a^2}{x - a}$ and try to make sense
of it for $x = a$. The trouble is, of course, that when we just plug in $x = a$,
we get $0/0$, which is undefined, because $0 \cdot c = 0$ for any number $c$.
But luckily, the numerator factors as $(x - a)(x + a)$, so we can cancel $x - a$
and rewrite our expression as $x + a$, which makes sense for $x = a$, giving us
$(x^2)' = 2x$. To generalize to $x^n$, we use the factorization
$x^n - a^n = (x - a)(x^{n-1} + x^{n-2}a + \dots + a^{n-1})$ to get
$(x^n)' = nx^{n-1}$.

This trick will work for any polynomial $p(x)$, because $a$ is a root of the
polynomial $p(x) - p(a)$, and therefore it is divisible by $x - a$, so we have
$p(x) - p(a) = (x - a)q(x, a)$, and we can rewrite $\frac{p(x) - p(a)}{x - a}$
as $q(x, a)$, which is a polynomial in $x$ and $a$ and therefore makes sense for
$x = a$, giving us the derivative $p'(a) = q(a, a)$.
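The division described above can be tried out numerically. The following is a minimal sketch, not code from the paper: the function names `divide_by_linear`, `evaluate` and `derivative_by_division` are hypothetical, and the polynomial is assumed to be given as a nonempty list of coefficients from the highest power down. It divides $p(x)$ by $x - a$ with Horner's scheme, which gives the quotient $q(x, a)$ for that fixed $a$ (the remainder is $p(a)$), and then plugs $x = a$ into the quotient.

```python
from fractions import Fraction


def divide_by_linear(coeffs, a):
    """Horner (synthetic) division of p(x) by (x - a).

    coeffs: coefficients of p, highest power first (assumed nonempty).
    Returns (quotient coefficients, remainder); the remainder equals p(a),
    so the quotient is (p(x) - p(a)) / (x - a) for this fixed a.
    """
    quotient = []
    carry = Fraction(0)
    for c in coeffs:
        carry = carry * a + c
        quotient.append(carry)
    remainder = quotient.pop()
    return quotient, remainder


def evaluate(coeffs, x):
    """Evaluate a polynomial (highest power first) at x by Horner's rule."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def derivative_by_division(coeffs, a):
    """Return q(a, a), where p(x) - p(a) = (x - a) q(x, a)."""
    a = Fraction(a)
    quotient, _ = divide_by_linear(coeffs, a)
    return evaluate(quotient, a)


# p(x) = x^4 at a = 2: the quotient is x^3 + 2x^2 + 4x + 8, and q(2, 2) = 32 = 4 * 2^3.
slope = derivative_by_division([1, 0, 0, 0, 0], 2)
```

Exact rational arithmetic (`Fraction`) is used here only so that the result can be compared with $nx^{n-1}$ without rounding; no limit is taken anywhere.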

Of course we don't have to divide polynomials every time we differentiate
them. The first two differentiation rules tell us that $(f + g)' = f' + g'$ and
$(kf)' = kf'$ for any constant $k$; in other words, differentiation is a linear
operation, and therefore we can differentiate polynomials "term by term," i.e.,

    $(p_0 + p_1 x + \dots + p_n x^n)' = p_1 + 2p_2 x + \dots + n p_n x^{n-1}$.

The other two rules of differentiation, the product (or Leibniz) rule, saying that
$(fg)' = f'g + fg'$, and the chain rule by Newton, $(f(g(x)))' = f'(g(x))g'(x)$, are
a matter of algebra of polynomials.

The trick developed here can be used to differentiate all rational functions, 
and even algebraic functions that are defined implicitly by algebraic equations, 
if we use implicit differentiation. 

2.2 Double roots and the basic estimate 

Consider a polynomial $p(x)$. The question is: why does the tangent to the graph
$y = p(x)$ at the point $(a, p(a))$, which is the line defined by the equation
$y = p(a) + p'(a)(x - a)$, look like a tangent, i.e., "cling" to this graph? Let us
start with a simple example, $p(x) = x^k$. Then $p'(a) = ka^{k-1}$, and

    $x^k - a^k - ka^{k-1}(x - a) = (x - a)(x^{k-1} + x^{k-2}a + \dots + a^{k-1} - ka^{k-1}) = (x - a)^2 r(x, a)$,

with $r$ a polynomial in $x$ and $a$, because the second factor vanishes for
$x = a$, so it is divisible by $x - a$. A similar factoring,
$p(x) - p(a) - p'(a)(x - a) = (x - a)^2 r(x, a)$,
holds for any polynomial $p$ since it is a sum of monomials. It shows that $x = a$
is a double root of the equation $p(x) - p(a) - p'(a)(x - a) = 0$. This fact can
be taken as the definition of a tangent to the graph of a polynomial, and can
be used to define the derivative for polynomials. The vertical distance $d(x, a)$
between the graph and the tangent can be written as $(x - a)^2 |r(x, a)|$, with $r$
a polynomial in $x$ and $a$. When $x$ and $a$ are contained in some finite interval,
$|r(x, a)|$ will be bounded from above by some constant $K$, giving us the estimate
$d(x, a) \le K(x - a)^2$. This basic estimate, which can also be written as

    $|p(x) - p(a) - p'(a)(x - a)| \le K(x - a)^2$    (1)

holds for any polynomial $p$, and explains why tangents cling to the graphs. We
will use it in the next subsection to understand why a polynomial with a positive
derivative is increasing.
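As a small worked instance of the basic estimate (1), here is the factoring for $p(x) = x^3$; the choice of this polynomial and of the interval $[-1, 1]$ is an illustrative assumption, not taken from the text.

```latex
\begin{align*}
x^3 - a^3 - 3a^2(x - a) &= (x - a)\,(x^2 + xa + a^2 - 3a^2) \\
                        &= (x - a)\,(x^2 + xa - 2a^2) \\
                        &= (x - a)^2\,(x + 2a).
\end{align*}
```

So here $r(x, a) = x + 2a$, and for $x$ and $a$ in $[-1, 1]$ we have $|r(x, a)| \le 3$, which means that (1) holds on that interval with $K = 3$.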

2.3 Monotonicity principle 

The derivative is a mathematical metaphor for the instantaneous velocity, or 
the instantaneous rate of change of a function relative to its argument. So 
we would expect that a function with positive derivative would be increasing. 
Let us see why it is true for polynomials. Assume that $p'(x) \ge 0$ for any $x$
such that $A \le x \le B$. We want to show that $p(A) \le p(B)$. We can deal
with a simpler case $p'(x) \ge C > 0$ first. Our basic estimate (1) tells us that
$p(x) - p(a) \ge p'(a)(x - a) - K(x - a)^2$, so $p(a) \le p(x)$ if
$0 < x - a \le C/K$. Therefore, $p(A) \le p(B)$, since we can get from $A$ to $B$
by steps shorter than $C/K$. To get to the original assumption, we can consider
$q(x) = p(x) + Cx$ with $C > 0$ and conclude that $p(B) - p(A) \ge C(A - B)$,
therefore $p(A) \le p(B)$ since $C$ is arbitrary.
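The single step of this argument can be spelled out as follows. This is only a restatement of the reasoning above, assuming $K > 0$, $p'(a) \ge C > 0$ and $0 < x - a \le C/K$.

```latex
\begin{align*}
p(x) - p(a) &\ge p'(a)(x - a) - K(x - a)^2 \\
            &= (x - a)\bigl(p'(a) - K(x - a)\bigr) \\
            &\ge (x - a)\Bigl(C - K \cdot \frac{C}{K}\Bigr) = 0 .
\end{align*}
```

Splitting $[A, B]$ into $N$ equal steps with $N \ge K(B - A)/C$ makes every step no longer than $C/K$, and chaining the resulting inequalities gives $p(A) \le p(B)$.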

By applying our monotonicity principle to $Mx + p$ and $Mx - p$, we can
demonstrate

Corollary 1 The Rule of Bounded Change.
If $|p'| \le M$, then $|p(x) - p(a)| \le M|x - a|$.

When we look at definite integrals as increments of anti-derivatives, we can
see how monotonicity is related to positivity of the area. 

2.4 Formal integration 

Formal integration can be introduced before the basic estimate is treated and
the monotonicity theorem is demonstrated, and it is very easy for polynomials.
Besides, it provides strong evidence for the Newton-Leibniz theorem. The
simplest examples, of course, are the constants and the linear functions. A bit
more work is required to calculate the areas under the other power curves, and
may give the skeptics an opportunity to use such tools as algebra, the geometric
series and even combinatorics (to estimate the sum $1^k + 2^k + \dots + n^k$).
Newton-Leibniz is very intuitive and can be explained early on. The integration
rules are just the rules of differentiation, rewritten in terms of integrals.
This formal theory can be used right away to solve some interesting problems in
geometry and physics.

3 Uniform Lipschitz Calculus 

How can we extend our calculus to functions more general than polynomials? 

As often happens in mathematics, we just look at some useful property or
formula and make it into a definition (think about the Pythagorean Theorem).
The useful property here will be the basic estimate (1) from section 2.2, so we
call a function $f$ uniformly Lipschitz differentiable (ULD) if the estimate

    $|f(x) - f(a) - f'(a)(x - a)| \le K(x - a)^2$    (2)

holds for some constant K independent of x and a. 

Now we can prove our monotonicity theorem from section 2.3 for ULD func- 
tions. 



3.1 The automatic Lipschitz estimate for the derivative 

We know that the derivatives of polynomials are also polynomials. What would
be the analogous fact for ULD functions? It turns out that their derivatives
are Lipschitz, i.e., they satisfy the estimate $|f'(x) - f'(a)| \le L|x - a|$
with $L$ independent of $x$ and $a$.

To see it, we notice that for $x \ne a$,
$\left|\frac{f(x) - f(a)}{x - a} - f'(a)\right| \le K|x - a|$. By
interchanging $x$ and $a$ we get
$\left|\frac{f(a) - f(x)}{a - x} - f'(x)\right| \le K|a - x|$, but
$\frac{f(a) - f(x)}{a - x} = \frac{f(x) - f(a)}{x - a}$,
and we see that $|f'(x) - f'(a)| \le 2K|x - a|$, i.e., $f'$ is Lipschitz.

Of course all the polynomials are Lipschitz on any finite interval, because
$x - a$ is a factor in $p(x) - p(a)$, and the ULD functions are too, because their
derivatives are bounded on any finite interval, and we get
$|f(x) - f(a)| \le M|x - a|$ from the rule of bounded change. As we will see later
(for general moduli of continuity), the analogy runs even deeper, and in fact
differentiation of ULD functions is related to factoring in the class of
Lipschitz functions the same way as differentiation of polynomials is related to
their factoring.



3.1.1 A comparison with the non-standard analysis approach

In non-standard analysis the derivatives of functions differentiable on a
hyperreal interval are automatically continuous. The proof goes the same way,
except we say that $\frac{f(x) - f(a)}{x - a} - f'(a)$ and
$\frac{f(x) - f(a)}{x - a} - f'(x)$ are infinitely small when $x - a$
is, and conclude that in this case $f'(x) - f'(a)$ is infinitely small too. It is
this fact that makes the non-standard approach to calculus simple. More
generally, many pointwise estimates on a hyperreal interval are in fact uniform.
In uniform differentiation theory we work with uniform estimates directly and get
the results much cheaper, without any infinitesimals, which are not constructive.
See http://en.wikipedia.org/wiki/Hyperreal_number where I wrote a section "An
intuitive approach to the ultrapower construction" and references there.



3.2 Integration, existence of a primitive and Newton-Leibniz 

It is easy to integrate polynomials and rational functions since antiderivatives
can be written down explicitly in terms of the elementary functions, but this
situation is rather exceptional. Now, we know that the derivative of any ULD
function is Lipschitz, and we ask if an antiderivative exists for any Lipschitz
function, in what sense it exists, and how it can be calculated. The idea is to
define the definite integral as the area under the graph, then to make sense
of the notion of the area by constructing explicit approximations (pretty much
following the approach of the Greeks, later developed by Riemann, Darboux,
Jordan, and Lebesgue), and then to prove Newton-Leibniz. The case of Lipschitz,
and other uniformly continuous functions, is particularly simple, and requires
hardly any sophistication. A picture (that is worth a thousand words) is available
on page 13 at http://www.mathfoolery.org/talk-2004.pdf and pages 43-44 at
http://www.mathfoolery.org/lathead.pdf with a proof of Newton-Leibniz.
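To illustrate why the Lipschitz case needs hardly any sophistication, here is a minimal sketch of one explicit approximation, a left Riemann sum on equal subintervals. It is an illustration under stated assumptions rather than the construction from the linked notes: the names `left_riemann_sum` and `lipschitz_error_bound` are hypothetical, and the error bound $L(b - a)^2/(2n)$ is the standard one obtained by bounding $|f(x) - f(x_i)| \le L(x - x_i)$ on each subinterval.

```python
def left_riemann_sum(f, a, b, n):
    """Approximate the area under f on [a, b] by n left-endpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))


def lipschitz_error_bound(lipschitz_constant, a, b, n):
    """Bound on |integral - left sum| when |f(x) - f(y)| <= L |x - y| on [a, b].

    Each of the n strips of width h contributes at most L * h**2 / 2.
    """
    return lipschitz_constant * (b - a) ** 2 / (2 * n)


# Example (an assumption for illustration): f(x) = x^2 on [0, 1] is Lipschitz
# with L = 2 there, and the exact area is 1/3.
approximation = left_riemann_sum(lambda x: x * x, 0.0, 1.0, 100)
bound = lipschitz_error_bound(2.0, 0.0, 1.0, 100)
actual_error = abs(approximation - 1.0 / 3.0)
```

The point of the bound is that it is explicit and uniform: it depends only on the Lipschitz constant, the length of the interval and the number of strips, so the approximations can be made as accurate as desired without any appeal to compactness.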



4 Other Moduli of Continuity 

Sometimes calculus based on Lipschitz estimates is too restrictive. For example,
the function $x^{3/2}$ has $\frac{3}{2}\sqrt{x}$ for the derivative, which is not
Lipschitz, since it grows too fast near $x = 0$. To treat this function as
differentiable, we can relax the estimate (2) defining differentiability to
$|f(x) - f(a) - f'(a)(x - a)| \le K|x - a|^{3/2}$.
More generally, we can use the inequality

    $|f(x) - f(a) - f'(a)(x - a)| \le K|x - a|\,m(|x - a|)$

with some modulus of continuity $m$ to define $m$-differentiability.
$m(x) = \sqrt{x}$ is an example; for any positive $\gamma \le 1$, $m(x) = x^\gamma$
is a more general example. The corresponding differentiability is called uniform
Hölder with the exponent $\gamma$, and the corresponding derivatives are Hölder,
i.e., $|f'(x) - f'(a)| \le H|x - a|^\gamma$ holds.
In general, we want $m$ to be defined for $x \ge 0$, increasing, continuous at $0$
with $m(0) = 0$, and subadditive, i.e., $m(x + y) \le m(x) + m(y)$. All the
Lipschitz theory extends to the general moduli of continuity with some obvious
modifications: the derivatives are $m$-continuous, i.e.,
$|f'(x) - f'(a)| \le Km(|x - a|)$, etc.



4.1 An estimate of the difference quotient 

Let $m$ be a subadditive modulus of continuity, in particular, $m$ is increasing,
defined for $x \ge 0$, and $m(x)/x$ is decreasing for $x > 0$, and let $f$ be a
uniformly $m$-differentiable function, i.e., there is an estimate, uniform in $x$
and $a$, with some constant $K$:

    $|f(x) - f(a) - f'(a)(x - a)| \le K|x - a|\,m(|x - a|)$    (3)

Let the difference quotient for $f$ be the 2-variable function

    $Q_f(x, a) = (f(x) - f(a))/(x - a)$ for $x \ne a$ and $Q_f(x, x) = f'(x)$.

We want to demonstrate the inequality

    $|Q_f(x, a) - Q_f(y, a)| \le 2Km(|x - y|)$,    (4)

which means that the difference quotient is uniformly $m$-continuous. That
will justify the idea that uniform differentiation is factoring in the class of
$m$-continuous functions of 2 variables.

Because only the increments of the independent variable and the corresponding
increments of the values of $f$ are involved in the difference quotient, we can
assume $a = 0 = f(0)$ and the inequality we want becomes

    $|f(x)/x - f(y)/y| \le 2Km(|x - y|)$.    (5)

The case $x < 0 < y$ is easy because $|f(x)/x - f'(0)| \le Km(|x|)$ and
$|f(y)/y - f'(0)| \le Km(|y|)$, so
$|f(x)/x - f(y)/y| \le K(m(|x|) + m(|y|)) \le 2Km(|x - y|)$
because $m$ is increasing.

The case of $x$ and $y$ of the same sign, say, $0 < x < y$, is a bit more delicate.
First we notice that adding any linear function to $f$ does not change
$f(x)/x - f(y)/y$, so we can assume that $f'(x) = 0$. The left-hand side of the
inequality we want to prove can be rewritten as
$|((y - x)f(x) - x(f(y) - f(x)))/(xy)|$. Now,
$|f(y) - f(x)| \le K(y - x)m(y - x)$ because $f'(x) = 0$, and also
$|f(x)| \le Kxm(x)$ because $f(0) = 0$. So it is enough to show that
$\frac{y - x}{y}m(x) \le m(y - x)$. Again the case $x \le y - x$ is easy because
$m$ is increasing. We only have to use subadditivity of $m$ when $y - x \le x$.
In this case $m(x)/y < m(x)/x \le m(y - x)/(y - x)$ and we are done.

4.2 Epsilon-delta and moduli of continuity 

We used different moduli of continuity to describe uniform continuity and
differentiability. The question is: how much of the classical theory of
continuous and smooth functions do we miss, if any? The answer to this question
is "nothing." Let us consider uniform continuity; uniform differentiability is
analogous.

The classical way to describe uniform continuity of a function $f$ is to say
that for any $\varepsilon > 0$ there is $\delta > 0$ such that
$|f(x) - f(a)| < \varepsilon$ when $|x - a| < \delta$.

We want to show that there is a modulus of continuity $m$ such that the
inequality $|f(x) - f(a)| \le m(|x - a|)$ holds. Let us consider the following
function: $g(h) = \sup\{|f(x) - f(a)| : |x - a| \le h\}$. We know that $g$ will be
nonnegative, increasing, and $g(h) \to 0$ as $h \to 0$, so $g$ will become
continuous at $0$ if we put $g(0) = 0$. Now, on the $(h, y)$ plane consider the set
$\{(h, y) : y \le g(h)\}$ of points under the graph of $g$. Take the convex hull of
this set. The upper edge of this convex hull will be the graph of a concave (and
therefore subadditive) modulus of continuity for $f$.
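The construction above can be tried on sampled data. The following is a minimal sketch under loudly stated assumptions: the function is known only through its values on a uniform grid (so the computed $g$ is a lower estimate of the true supremum), and the names `modulus_samples` and `concave_majorant` are hypothetical. The first function computes $g$ at the grid spacings, the second takes the upper edge of the convex hull of the points $(h, g(h))$ by a standard monotone-chain scan.

```python
def modulus_samples(values):
    """g[j] = max |values[i + d] - values[i]| over all index gaps d <= j.

    values are samples of f on a uniform grid, so g[j] estimates g(j * step).
    """
    n = len(values)
    g = [0.0]
    for j in range(1, n):
        largest_at_gap_j = max(abs(values[i + j] - values[i]) for i in range(n - j))
        g.append(max(g[-1], largest_at_gap_j))  # running max enforces "gap <= j"
    return g


def cross(o, p, q):
    """Cross product of the vectors o->p and o->q; positive means a left turn."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def concave_majorant(points):
    """Upper convex hull of points with strictly increasing first coordinate.

    The returned vertices, joined by straight segments, form the graph of the
    least concave piecewise-linear function lying on or above all the points.
    """
    hull = []
    for p in points:
        # Drop the last vertex while it lies on or below the chord to p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Example: samples of a staircase-like function with grid step 1.
samples = [0, 1, 1, 3]
g = modulus_samples(samples)
vertices = concave_majorant([(j, g[j]) for j in range(len(g))])
```

Since the first vertex is $(0, 0)$ and the hull is concave, the resulting piecewise-linear function is subadditive, which is the property used in section 4.1.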

Needless to say, in some questions (such as topological classification
of dynamical systems) keeping track of the particular moduli of continuity may 
be a nuisance, and not fruitful. Then we can throw all the uniformly continuous 
or uniformly differentiable functions into one big pile and enjoy the generality. 
5 Some Pedagogical implications

5.1 Calculus by problem solving: a still unrealized dream 

http://www.mathfoolery.org/Problem^ets/hw.html

6 Many Variables 

6.1 Differentiability 

Similar to the case of one variable, we define differentiability by the inequality 

    $|f(x + h) - f(x) - f'(x)h| \le K|h|\,m(|h|)$.

Here |.| denotes some norm, for example, the Euclidean norm, f'{x) is a linear 
map depending on x, K is a constant and m is a modulus of continuity. 

6.2 Automatic continuity of the derivative 

We want to show that the uniform derivative is uniformly continuous with the
modulus of continuity $m$ from the definition, i.e., the inequality

    $|f'(x + h) - f'(x)| \le Lm(|h|)$

holds for some constant $L$ that will depend on $K$ in the definition. Here $|.|$ is
the norm of linear operators, $|A| = \sup\{|Ak| : |k| = 1\}$.

The idea of the simplest proof I could come up with is the following. There
are two ways to get from $x$ to $x + h + k$. We can go directly, or we can go from
$x$ to $x + h$ first and then from $x + h$ to $x + h + k$. The corresponding
increments of the function $f$ should be the same. Now consider the approximation
of these increments by the differentials.

    $|f(x + h) - f(x) - f'(x)h| \le K|h|\,m(|h|)$

    $|f(x + h + k) - f(x + h) - f'(x + h)k| \le K|k|\,m(|k|)$

    $|-f(x + h + k) + f(x) + f'(x)(h + k)| \le K|h + k|\,m(|h + k|)$

By "adding" all of these inequalities and using the triangle inequality,
$|a + b| \le |a| + |b|$, and linearity, $f'(x)(h + k) = f'(x)h + f'(x)k$, we
conclude that

    $|f'(x)k - f'(x + h)k| \le K(|h|\,m(|h|) + |k|\,m(|k|) + |h + k|\,m(|h + k|))$.

But $|h + k| \le |h| + |k|$ and $m(|h + k|) \le m(|h| + |k|) \le m(|h|) + m(|k|)$
(triangle inequality, $m$ is increasing and subadditive). Finally, by taking
$|k| = |h|$ we get

    $|(f'(x + h) - f'(x))k| = |f'(x)k - f'(x + h)k| \le 6Km(|h|)|k|$,

which means that $|f'(x + h) - f'(x)| \le 6Km(|h|)$, so we can take $L = 6K$. Done.
6.3 The equality of the mixed derivatives

Probably the simplest way to understand why $f_{xy} = f_{yx}$ is to use Green's
formula. Here is how. Let us consider a rectangle $ABCD$ on the plane
where $f$ is defined, with $A = (a, c)$, $B = (b, c)$, $C = (b, d)$ and
$D = (a, d)$. There are two ways to get from $A$ to $C$. We can go from $A$ to $B$
and then from $B$ to $C$, or we can go from $A$ to $D$ and then from $D$ to $C$.
The total change in $f$ should be the same for both ways. Let us write down this
change in terms of the line integrals of the partial derivatives:

    $f(C) - f(A) = f(B) - f(A) + f(C) - f(B) = \int_a^b f_x(x, c)\,dx + \int_c^d f_y(b, y)\,dy$

and on the other hand,

    $f(C) - f(A) = f(D) - f(A) + f(C) - f(D) = \int_c^d f_y(a, y)\,dy + \int_a^b f_x(x, d)\,dx$

so we have

    $\int_a^b (f_x(x, d) - f_x(x, c))\,dx - \int_c^d (f_y(b, y) - f_y(a, y))\,dy = 0$,

but $f_x(x, d) - f_x(x, c) = \int_c^d f_{xy}(x, y)\,dy$ and
$f_y(b, y) - f_y(a, y) = \int_a^b f_{yx}(x, y)\,dx$
and, replacing the iterated integrals with the double integrals, we conclude
that $\iint_{ABCD} (f_{xy} - f_{yx})\,dx\,dy = 0$ for any rectangle $ABCD$. It is
only possible if $f_{xy} - f_{yx} = 0$, so $f_{xy} = f_{yx}$ and we are done.